DCT-based Video Features for Audio-Visual Speech Recognition

Authors

Martin Heckmann

Kristian Kroschel

Frédéric Berthommier

Abstract

Encouraged by the good performance of the DCT in audio-visual speech recognition [1], we investigate how the selection of the DCT coefficients influences the recognition scores of a hybrid ANN/HMM audio-visual speech recognition system on a continuous word recognition task with a vocabulary of 30 numbers. Three sets of coefficients were chosen, based on the mean energy, the variance, and the variance relative to the mean value. The performance of these coefficient sets is evaluated in a video-only and an audio-visual recognition scenario with varying signal-to-noise ratios (SNR). The audio-visual tests are performed with 5 types of additional noise at 12 SNR values each. Furthermore, the results of the DCT-based recognition are compared to those obtained via chroma-keyed geometric lip features [2]. To enable this comparison, a second audio-visual database without chroma-key has been recorded. This database has similar content but a different speaker.
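The three selection criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the NumPy-only DCT, the use of the squared coefficient as "energy", and the reading of "variance relative to the mean value" as variance divided by the absolute mean are all assumptions made for illustration.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def dct2_frames(frames):
    """2-D DCT of every frame in an array of shape (T, H, W)."""
    _, h, w = frames.shape
    return dct_matrix(h) @ frames @ dct_matrix(w).T


def select_coefficients(coeffs, criterion, n_keep):
    """Flat indices of the n_keep best-ranked coefficients (hypothetical helper).

    criterion: "energy", "variance" or "relative_variance" (assumed names).
    """
    flat = coeffs.reshape(coeffs.shape[0], -1)  # (T, H*W)
    if criterion == "energy":
        score = np.mean(flat ** 2, axis=0)
    elif criterion == "variance":
        score = np.var(flat, axis=0)
    elif criterion == "relative_variance":
        # Assumed interpretation: variance divided by |mean|.
        score = np.var(flat, axis=0) / (np.abs(np.mean(flat, axis=0)) + 1e-12)
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return np.argsort(score)[::-1][:n_keep]


# Synthetic stand-in for a sequence of grey-level mouth-region images.
rng = np.random.default_rng(0)
frames = rng.random((50, 8, 8)) + 5.0
coeffs = dct2_frames(frames)
features = coeffs.reshape(50, -1)[:, select_coefficients(coeffs, "variance", 10)]
```

Under these assumptions, the energy criterion tends to favour the DC coefficient and other low-frequency terms, whereas the relative-variance criterion penalises coefficients with a large mean, so the selected sets can differ noticeably.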

Similar resources

DCT-based video features for audio-visual speech recognition

Neural Network Performance Analysis for Real Time Hand Gesture Tracking Based on Hu Moment and Hybrid Features

This paper presents a comparison study between the multilayer perceptron (MLP) and radial basis function (RBF) neural networks with supervised learning and back propagation algorithm to track hand gestures. Both networks have two output classes which are hand and face. Skin is detected by a regional based algorithm in the image, and then networks are applied on video sequences frame by frame in...

Detection and localization of Farsi/Arabic text in video images

Video text detection plays an important role in applications such as semantic-based video analysis, text information retrieval, and archiving. In this paper, we propose a Farsi/Arabic text detection approach. First, edges are extracted with an appropriate edge detector, and then artificial corners are extracted by using the edge cross points. Artificial corner histogram analysis is done for ...

Edge-based semantic classification of sports video sequences

This paper presents an edge-based semantic classification of sports video sequences. The paper presents an algorithm for edge detection, and illustrates the usage of edges for semantic analysis of video content. We first propose an algorithm for detecting edges within video frames directly on the MPEG format without a decompression process. The algorithm is based on a spatial-domain synthetic e...

Audio-Video based Classification using SVM and AANN

This paper presents a method to classify audio-video data into one of five classes: advertisement, cartoon, news, movie and songs. Automatic audio-video classification is very useful for audio-video indexing and content-based audio-video retrieval. Mel frequency cepstral coefficients are used to characterize the audio data. The color histogram features extracted from the images in the video clips a...

Publication date: 2002
